1 research outputs found

    Learning efficient temporal information in deep networks: From the viewpoints of applications and modeling

    Get PDF
    With the introduction of deep learning, machine learning has dominated several technology areas, giving birth to high-performance applications that can even challenge human-level accuracy. However, the complexity of deep models is also exploding as a by-product of the revolution of machine learning. Such enormous model complexity has raised the new challenge of improving the efficiency in deep models to reduce deployment expense, especially for systems with high throughput demands or devices with limited power. The dissertation aims to improve the efficiency of temporal-sensitive deep models in four different directions. First, we develop a bandwidth extension mapping to avoid deploying multiple speech recognition systems corresponding to wideband and narrowband signals. Second, we apply a multi-modality approach to compensate for the performance of an excitement scoring system, where the input video sequences are aggressively down-sampled to reduce throughput. Third, we formulate the motion feature in the feature space by directly inducing the temporal information from intermediate layers of deep networks instead of relying on an additional optical flow stream. Finally, we model a spatiotemporal sampling network inspired by the human visual perception mechanism to reduce input frames and regions adaptively
    corecore